fix(db): reuse ReadOptions in ColumnDbSnapshot to reduce GC pressure#10894
Conversation
|
@claude review |
|
Claude finished @smartprogrammer93's task in 2m 5s —— View job Code ReviewGood optimization overall — the profiling data clearly justifies the changes, and the approach of eagerly destroying native handles + suppressing finalizers is sound. Below are the issues I found. Critical1. int ord = Unsafe.As<T, int>(ref k);
Use 2. No double-dispose guard in public void Dispose()
{
Native.Instance.rocksdb_readoptions_destroy(_sharedReadOptions.Handle);
GC.SuppressFinalize(_sharedReadOptions);
Native.Instance.rocksdb_readoptions_destroy(_sharedCacheMissReadOptions.Handle);
GC.SuppressFinalize(_sharedCacheMissReadOptions);
_snapshot.Dispose();
}If Moderate3. Data race on T[] keys = columnsDb._cachedColumnKeys ??= columnsDb._columnDbs.Keys.ToArray();
if (columnsDb._cachedMaxOrdinal < 0)
{
...
columnsDb._cachedMaxOrdinal = max;
}
4. Missing regression test [ Per project guidelines: "When fixing a bug, always add a regression test." The PR notes no tests were written. Even a simple unit test verifying that Minor5. // Single shared delegate for GetViewBetween — avoids per-reader closure allocation
Func<ReadOptions> readOptionsFactory = () => { ... };The delegate is one allocation (good), but each call to 6. LINQ in constructor [ columnsDb._columnDbs.Keys.ToArray()The coding style says "No LINQ when a simple SummaryThe optimization is well-motivated and the profiling data is compelling. The two critical issues (unsafe enum cast, missing double-dispose guard) should be addressed before merge. The thread-safety concern on the parent-level cache is also worth fixing since |
|
@claude rereview |
|
Claude finished @smartprogrammer93's task in 1m 38s —— View job Re-reviewBoth critical issues from the previous review have been addressed. Here's the full updated status. Previously Critical — Now Fixed1. ✅ All three call sites now use 2. ✅ Double-dispose guard added [ if (_disposed) return;
_disposed = true;Clean and correct. Note: per coding-style, Previously Moderate — Addressed with Acknowledged Trade-off3. Both fields are now
This is acceptable. Previously Minor — Fixed4. ✅ LINQ removed from constructor [ Replaced with a manual 5. ✅ Clarifying comment on The comment now explicitly calls out that per-call Still Open6. Missing regression test Per project guidelines ( New Observation7. public IReadOnlyKeyValueStore GetColumn(T key)
{
return _readers[Convert.ToInt32(key)];
}After SummaryThe two blockers from the first review are resolved. The remaining items are:
The optimization itself is solid and the profiling data clearly justifies it. Good to merge once the test is added. |
LukaszRozmej
left a comment
There was a problem hiding this comment.
Use helper methods to provide better context - naming what you are doing, will make code more self-documenting
| public void Dispose() | ||
| { | ||
| snapshot.Dispose(); | ||
| if (_disposed) return; |
There was a problem hiding this comment.
This doesn't need to be threadsafe?
We could do - although would need to by byte not bool
| if (_disposed) return; | |
| bool disposed = Interlocked.Exchange(ref _disposed, true); | |
| if (disposed) return; |
| public IReadOnlyKeyValueStore GetColumn(T key) | ||
| { | ||
| return _columnDbs[key]; | ||
| return _readers[Convert.ToInt32(key)]; |
There was a problem hiding this comment.
can be single line with =>
| _sharedReadOptions = new ReadOptions(); | ||
| _sharedReadOptions.SetVerifyChecksums(columnsDb.VerifyChecksum); | ||
| _sharedReadOptions.SetSnapshot(snapshot); |
| _sharedCacheMissReadOptions = new ReadOptions(); | ||
| _sharedCacheMissReadOptions.SetVerifyChecksums(columnsDb.VerifyChecksum); | ||
| _sharedCacheMissReadOptions.SetSnapshot(snapshot); | ||
| _sharedCacheMissReadOptions.SetFillCache(false); |
| T[]? keys = columnsDb._cachedColumnKeys; | ||
| if (keys is null) | ||
| { | ||
| IDictionary<T, ColumnDb> columnDbs = columnsDb._columnDbs; | ||
| keys = new T[columnDbs.Count]; | ||
| int idx = 0; | ||
| foreach (T key in columnDbs.Keys) | ||
| keys[idx++] = key; | ||
| columnsDb._cachedColumnKeys = keys; | ||
| } |
| if (columnsDb._cachedMaxOrdinal < 0) | ||
| { | ||
| int max = 0; | ||
| for (int i = 0; i < keys.Length; i++) | ||
| max = Math.Max(max, Convert.ToInt32(keys[i])); | ||
| columnsDb._cachedMaxOrdinal = max; | ||
| } |
| _readers = new RocksDbReader[columnsDb._cachedMaxOrdinal + 1]; | ||
| for (int i = 0; i < keys.Length; i++) | ||
| { | ||
| T k = keys[i]; | ||
| _readers[Convert.ToInt32(k)] = new RocksDbReader( | ||
| columnsDb, | ||
| _sharedReadOptions, | ||
| _sharedCacheMissReadOptions, | ||
| readOptionsFactory, | ||
| columnFamily: columnsDb._columnDbs[k]._columnFamily); | ||
| } |
|
@claude review |
|
Claude finished @smartprogrammer93's task in 2m 2s —— View job Re-review (Final)All previous critical and moderate issues have been resolved. LukaszRozmej's review suggestions are also addressed. Here is the full updated status. Previously Critical — Fixed ✅
LukaszRozmej's Suggestions — All Addressed ✅
Minor FindingMissing readonly ReadOptions _options;
readonly ReadOptions _hintCacheMissOptions;All other fields in the class (lines 20-23) have explicit Still Open
Missing regression test — Per SummaryThe optimization is well-implemented and the profiling data clearly justifies it. The PR has been approved by LukaszRozmej. The two open items ( |
…zer pressure ColumnDbSnapshot previously created 14 ReadOptions per snapshot (2 per column × 7 columns). ReadOptions in RocksDbSharp has a finalizer but no IDisposable, so these objects survive Gen0, get promoted to Gen1/Gen2, and cause expensive GC pauses. In FlatState block processing, 2000+ snapshots per BDN round produce 28,000+ finalizable objects, causing Gen1 (9/round) and Gen2 (1-2/round) collections that create timing spikes. Fix: Create 2 shared ReadOptions (normal + cache-miss) per ColumnDbSnapshot and pass them to all RocksDbReader instances. Explicitly destroy the native handles via rocksdb_readoptions_destroy + GC.SuppressFinalize in Dispose(). This reduces finalizable objects from 14 to 0 per snapshot scope. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…egate ColumnDbSnapshot used Dictionary<T, IReadOnlyKeyValueStore> to map columns to readers, plus 7 separate Func<ReadOptions> closure allocations. Replace with a flat array indexed by enum ordinal and a single shared delegate. This eliminates per-snapshot: 1 Dictionary + backing array + 6 extra delegate allocations (from 7 down to 1 shared). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Replace Unsafe.As<T, int> with Convert.ToInt32 for safe enum-to-int conversion regardless of underlying type (critical: buffer overread) - Add _disposed guard to Dispose() to prevent double-free on native ReadOptions handles (critical: use-after-free) - Use volatile fields for _cachedColumnKeys/_cachedMaxOrdinal to ensure cross-thread visibility (moderate: data race) - Replace LINQ ToArray() with manual loop (minor: style) - Clarify comment on readOptionsFactory noting GetViewBetween still creates per-call ReadOptions (minor: documentation) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ff0933e to
4c697c3
Compare
…sion tests Add ObjectDisposedException guard in ColumnDbSnapshot.GetColumn() to prevent use-after-free on destroyed native ReadOptions handles. Add explicit private modifier to RocksDbReader fields for style consistency. Add regression tests for idempotent double-dispose and post-dispose GetColumn behavior. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
@claude review |
|
Claude finished @smartprogrammer93's task in 1m 8s —— View job Re-review (Final Pass)
All previously raised issues have been resolved. Here is the complete status. Previously Critical — Fixed ✅1. private static int EnumToInt(T value) => Convert.ToInt32(value);All three call sites use this helper. Correct for all enum underlying types. 2. Double-dispose guard [ if (Interlocked.Exchange(ref _disposed, 1) != 0) return;Clean and correct. Previously Moderate — Fixed ✅3. Thread safety on Both fields are Previously Minor — Fixed ✅4. Missing private readonly ReadOptions _options;
private readonly ReadOptions _hintCacheMissOptions;Now matches the rest of the class. 5. Missing regression tests [ Two new tests added:
Satisfies the project's "always add a regression test" requirement from 6. ObjectDisposedException.ThrowIf(_disposed != 0, this);Guards against use-after-free on destroyed native Informational (Pre-existing, Not a Blocker)
SummaryAll critical, moderate, and minor issues from previous reviews are resolved. Tests cover both the double-dispose fix and the dispose guard on |
There was a problem hiding this comment.
Pull request overview
This PR optimizes RocksDB column snapshots in Nethermind.Db.Rocks to reduce allocation/finalizer pressure during high-frequency snapshot creation (notably in FlatState processing).
Changes:
- Reuse a shared pair of
ReadOptionsacross all column readers withinColumnsDb<T>.ColumnDbSnapshot, and explicitly destroy their native handles on dispose. - Replace per-snapshot
Dictionary<TKey, IReadOnlyKeyValueStore>with a flat array indexed by enum ordinal, and cache column keys/max ordinal on the parentColumnsDb<T>. - Add snapshot lifecycle tests (double-dispose + GetColumn-after-dispose) and a new
RocksDbReaderconstructor that accepts pre-createdReadOptions.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
src/Nethermind/Nethermind.Db.Test/ColumnsDbTests.cs |
Adds tests covering snapshot dispose behavior (idempotent dispose + disposed-guard). |
src/Nethermind/Nethermind.Db.Rocks/RocksDbReader.cs |
Adds a constructor overload to allow shared ReadOptions injection for snapshot readers. |
src/Nethermind/Nethermind.Db.Rocks/ColumnsDb.cs |
Implements shared ReadOptions, explicit native destroy, cached key metadata, and array-based column lookup in snapshots. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| private static void DestroyReadOptions(ReadOptions options) | ||
| { | ||
| Native.Instance.rocksdb_readoptions_destroy(options.Handle); | ||
| GC.SuppressFinalize(options); |
There was a problem hiding this comment.
DestroyReadOptions calls Native.Instance.rocksdb_readoptions_destroy(...) directly, bypassing the _rocksDbNative instance used throughout DbOnTheRocks/ColumnsDb for native interop (and which can be injected). For consistency and to preserve the ability to use a non-default Native implementation, consider using the owning DB's _rocksDbNative (e.g., capture it in ColumnDbSnapshot and use that here).
| { | ||
| return _columnDbs[key]; | ||
| ObjectDisposedException.ThrowIf(_disposed != 0, this); | ||
| return _readers[EnumToInt(key)]; |
There was a problem hiding this comment.
GetColumn now indexes directly into _readers without any bounds / null checks. If a caller passes an enum value that wasn't part of the configured columns (or the enum is sparse), this will either return null (leading to later NREs) or throw IndexOutOfRangeException, whereas the previous dictionary implementation would reliably throw a key-missing exception. Consider validating the computed ordinal and that the slot is populated, and throw a consistent exception (e.g., KeyNotFoundException/ArgumentOutOfRangeException) when the column isn't present.
| return _readers[EnumToInt(key)]; | |
| int ordinal = EnumToInt(key); | |
| if ((uint)ordinal >= (uint)_readers.Length || _readers[ordinal] is null) | |
| { | |
| throw new KeyNotFoundException($"Column '{key}' is not configured."); | |
| } | |
| return _readers[ordinal]; |
| GC.SuppressFinalize(options); | ||
| } | ||
|
|
||
| private static int EnumToInt(T value) => Convert.ToInt32(value); |
There was a problem hiding this comment.
EnumToInt uses Convert.ToInt32(value) on a generic enum T. This binds to the Convert.ToInt32(object) overload, which boxes the enum value and can introduce per-call allocations in a hot path (constructor loop + every GetColumn). Since the goal of this PR is to reduce GC pressure, consider switching to a non-boxing conversion (e.g., a cached converter per T based on the enum underlying type, or constrain/assume int underlying for column enums and use a direct cast/Unsafe conversion).
| private static int EnumToInt(T value) => Convert.ToInt32(value); | |
| private static int EnumToInt(T value) => value.ToInt32(); |
- Use captured _rocksDbNative instead of Native.Instance in DestroyReadOptions for consistency with injectable native interop - Add bounds/null check in GetColumn to throw KeyNotFoundException (matching prior Dictionary behavior) instead of IndexOutOfRangeException/NRE - Replace boxing Convert.ToInt32(object) with Unsafe.As<T, int> in EnumToInt to avoid per-call allocation on the read path Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
@claude review |
|
Claude finished @smartprogrammer93's task in 1m 57s —— View job Re-review
Previously Resolved — Status UpdateAll items from the three previous review rounds remain addressed:
Regression:
|
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // Safety: all column enums in this codebase use int as underlying type. | ||
| // Unsafe.As avoids the boxing that Convert.ToInt32(object) would cause. | ||
| private static int EnumToInt(T value) => Unsafe.As<T, int>(ref value); |
There was a problem hiding this comment.
EnumToInt uses Unsafe.As<T,int> which is only correct when the enum underlying type is int. ColumnsDb<T> is generic over any enum, so if a caller ever uses a non-int underlying type (or negative/large values), this can produce incorrect ordinals and lead to out-of-range writes in CreateReaders() (or huge array allocations). Consider switching to a conversion that is correct for all underlying enum types (e.g., based on Unsafe.SizeOf<T>() / Enum.GetUnderlyingType with non-boxing casts) and/or validating the configured keys up front before building the reader array.
| // Safety: all column enums in this codebase use int as underlying type. | |
| // Unsafe.As avoids the boxing that Convert.ToInt32(object) would cause. | |
| private static int EnumToInt(T value) => Unsafe.As<T, int>(ref value); | |
| // Convert enum value to int in a way that is safe for all underlying enum types. | |
| private static int EnumToInt(T value) => Convert.ToInt32(value); |
- Replace Unsafe.As<T, int> with size-dispatched helper that handles byte, short, and int-backed enums safely (JIT eliminates dead branches) - Move _cachedMaxOrdinal volatile write inside GetCachedMaxOrdinal so it only writes when not yet cached, avoiding unnecessary memory fence on every CreateSnapshot call Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
@claude review |
|
Claude finished @smartprogrammer93's task in 1m 24s —— View job Re-review (Final Pass)
All issues from every previous review round have been addressed. Here is the complete status. Previously Critical — Fixed ✅1. private static int EnumToInt(T value)
{
if (Unsafe.SizeOf<T>() == sizeof(int)) return Unsafe.As<T, int>(ref value);
if (Unsafe.SizeOf<T>() == sizeof(byte)) return Unsafe.As<T, byte>(ref value);
if (Unsafe.SizeOf<T>() == sizeof(short)) return Unsafe.As<T, short>(ref value);
return Convert.ToInt32(value); // fallback for long-backed enums
}Correct for all enum underlying types. The JIT eliminates dead branches per instantiation — zero overhead at runtime. 2. Double-dispose guard [ if (Interlocked.Exchange(ref _disposed, 1) != 0) return;Clean and correct. Previously Moderate — Fixed ✅3. The write is now inside 4. Thread-safety on cached fields [ Both Previously Minor — Fixed ✅5.
6. Bounds check in if ((uint)ordinal >= (uint)_readers.Length || _readers[ordinal] is null)
throw new KeyNotFoundException(...);Efficient unsigned comparison. Consistent exception type. 7. ObjectDisposedException.ThrowIf(_disposed != 0, this);Correct per coding-style convention. 8.
9. Regression tests [
Both tests satisfy the project's "always add a regression test" requirement. Informational (Pre-existing, Not a Blocker)
SummaryAll critical, moderate, and minor issues across all four previous review rounds are resolved. The optimization is well-implemented, the profiling data clearly justifies it, and the code is correct. Ready to merge. |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| _hintCacheMissOptions = readOptionsFactory(); | ||
| _hintCacheMissOptions.SetFillCache(false); | ||
| _options = options; | ||
| _hintCacheMissOptions = hintCacheMissOptions; |
There was a problem hiding this comment.
The new constructor that accepts pre-created ReadOptions doesn’t enforce the invariant that hintCacheMissOptions should have FillCache=false (the factory-based ctor always sets this). Since this ctor is public, it’s easy for a future caller to accidentally pass an options instance with FillCache=true, changing cache behavior. Consider setting hintCacheMissOptions.SetFillCache(false) inside this ctor (or validating/asserting it) so both constructors preserve the same semantics by default.
| _hintCacheMissOptions = hintCacheMissOptions; | |
| _hintCacheMissOptions = hintCacheMissOptions; | |
| _hintCacheMissOptions.SetFillCache(false); |
| // Non-boxing enum-to-int conversion. JIT eliminates dead branches at | ||
| // instantiation time, so this is zero-cost for any underlying type. | ||
| private static int EnumToInt(T value) | ||
| { | ||
| if (Unsafe.SizeOf<T>() == sizeof(int)) return Unsafe.As<T, int>(ref value); | ||
| if (Unsafe.SizeOf<T>() == sizeof(byte)) return Unsafe.As<T, byte>(ref value); | ||
| if (Unsafe.SizeOf<T>() == sizeof(short)) return Unsafe.As<T, short>(ref value); | ||
| return Convert.ToInt32(value); // fallback for long-backed enums | ||
| } |
There was a problem hiding this comment.
The comment above EnumToInt says “Non-boxing enum-to-int conversion”, but the Convert.ToInt32(value) fallback will box for most non-(byte/short/int)-sized enum underlying types due to overload resolution in generics. Consider tightening the comment (or adding explicit long/uint/ushort unsafe branches if truly aiming for non-boxing).
Ensures cross-platform visibility of the dispose flag, matching the Interlocked.Exchange write in Dispose(). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Changes
ReadOptionsinstances (normal + cache-miss) across all 7 columnRocksDbReaderinstances inColumnDbSnapshot, instead of creating 14 separate onesReadOptionshandles viarocksdb_readoptions_destroy+GC.SuppressFinalizeinColumnDbSnapshot.Dispose(), with double-dispose guardDictionary<T, IReadOnlyKeyValueStore>with flat array indexed by enum ordinal viaConvert.ToInt32(safe for all enum underlying types)Func<ReadOptions>delegate across all column readers instead of 7 separate closuresColumnsDb<T>with volatile fields for thread safetyRocksDbReaderconstructor that accepts pre-createdReadOptionsfor shared useRoot cause:
ReadOptionsin RocksDbSharp has a finalizer (~ReadOptions()) but does not implementIDisposable. EachColumnDbSnapshotcreated 14 finalizableReadOptionsobjects (2 per column x 7 columns), plus aDictionary, 7RocksDbReaderinstances, and 7 closure delegates. In FlatState block processing, thousands of snapshots per second produce tens of thousands of finalizable objects that survive Gen0, get promoted to Gen1/Gen2, and trigger expensive stop-the-world GC collections.Profiling data (ContractCall_200 FlatState, 2000 block processing calls per round, 20 rounds):
Remaining Gen1/Gen2 pressure is from per-scope FlatState infrastructure objects (
FlatWorldStateScope,SnapshotBundle,ReadOnlySnapshotBundle,StateTree, etc.) that require deeper architectural changes to pool.Types of changes
Testing
Requires testing
If yes, did you write tests?
Notes on testing
Validated via ProfileRunner benchmark with per-round GC tracking. Existing tests should pass since the behavior is identical - same ReadOptions configuration, just shared instead of duplicated. The array-indexed lookup via
Convert.ToInt32is functionally equivalent to the dictionary lookup and safe for all enum underlying types.Documentation
Requires documentation update
Requires explanation in Release Notes